MPSearch: Multi-Path Search for Tree-based Indexes to Exploit Internal Parallelism of Flash SSDs

Authors

  • Hongchan Roh
  • Sanghyun Park
  • Mincheol Shin
  • Sang-Won Lee
Abstract

Real-time processing of big data aims for faster data retrieval and analysis. To accelerate real-time processing, big data platforms have lately been trying to exploit NAND flash based storage devices, especially SSDs. NoSQL DBMSs, which are used for real-time management of big data, depend heavily on index structures to manage data efficiently. Previous research on flash-aware index structures addressed the potential problems of hard-disk-oriented designs. In this paper, we focus instead on exploiting the potential benefits of flash SSDs. First, we examine the internal parallelism of flash SSDs by benchmarking several devices. Then we present a new I/O request concept, called psync I/O, that can exploit the internal parallelism of flash SSDs within a single process, and we propose a new search method (MPSearch) that enables tree-based indexes to exploit this parallelism. Based on MPSearch, we present a B+-tree variant, the PIO B-tree (Parallel I/O B-tree). The PIO B-tree improved on the B+-tree's insert performance by a factor of up to 16.3 and on its point-search performance by a factor of 1.2. Range search on the PIO B-tree was up to 5 times faster than on the B+-tree. Moreover, the PIO B-tree outperformed other flash-aware indexes on various synthetic workloads. To enhance NoSQL DBMS performance on flash SSDs, the PIO B-tree can be adopted directly, or MPSearch can be applied to the other tree-based index structures that NoSQL DBMSs use.
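The abstract does not spell out how a multi-path search proceeds, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes that a batch of lookups descends the tree one level at a time and that all node reads needed at a level are issued together. A thread pool stands in for psync I/O here, and `PAGES`, `read_page`, `read_pages_batch`, and `multi_path_search` are hypothetical names introduced for illustration.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "pages" standing in for index nodes stored on an SSD.
# Internal node: ("internal", separator_keys, child_page_ids)
# Leaf node:     ("leaf", keys, values)
PAGES = {
    0: ("internal", [20, 40], [1, 2, 3]),
    1: ("leaf", [5, 10, 15], ["a", "b", "c"]),
    2: ("leaf", [20, 25, 30], ["d", "e", "f"]),
    3: ("leaf", [40, 50], ["g", "h"]),
}
ROOT = 0


def read_page(page_id):
    """Stand-in for a single page read from the device."""
    return PAGES[page_id]


def read_pages_batch(page_ids, pool):
    """Issue all reads for one tree level together and wait for all of them.

    Assumption: this thread pool only imitates submitting several reads at
    once; it is not the psync I/O interface described in the paper.
    """
    ids = list(page_ids)
    return dict(zip(ids, pool.map(read_page, ids)))


def multi_path_search(keys):
    """Look up many keys at once, descending the tree level by level."""
    results = {}
    # Maps each page to the keys whose search path passes through it.
    frontier = {ROOT: list(keys)}
    with ThreadPoolExecutor(max_workers=8) as pool:
        while frontier:
            nodes = read_pages_batch(frontier.keys(), pool)
            next_frontier = {}
            for page_id, pending in frontier.items():
                kind, node_keys, payload = nodes[page_id]
                for key in pending:
                    if kind == "leaf":
                        i = bisect_right(node_keys, key) - 1
                        found = i >= 0 and node_keys[i] == key
                        results[key] = payload[i] if found else None
                    else:
                        # Keys equal to a separator go to the right child.
                        child = payload[bisect_right(node_keys, key)]
                        next_frontier.setdefault(child, []).append(key)
            frontier = next_frontier
    return results


if __name__ == "__main__":
    print(multi_path_search([10, 25, 50, 7]))
```

In this sketch, keys that share a node are grouped so the node is read once per level, and the number of read rounds equals the tree height rather than the number of keys. Whether that matches the paper's design in detail cannot be confirmed from the abstract alone.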


Similar resources


AS B-tree: A Study of an Efficient B+-tree for SSDs

Recently, flash memory has been utilized as the primary storage device in mobile devices. SSDs have been gaining popularity as the primary storage device in laptop and desktop computers and even in enterprise-level server machines. SSDs have an array of NAND flash memory packages and are therefore able to achieve concurrent parallel access to one or more flash memory packages. In order to take ...


Optimizing Database Operators by Exploiting Internal Parallelism of Solid State Drives

With the development of flash memory technology, flash-based solid state drives (SSDs) are gradually used in more and more devices and applications. In addition to characteristics of flash memory itself, a unique characteristic of SSDs, namely internal parallelism, should also be considered to improve performance of SSDs-based DBMSs, especially query processing. In this paper, we first describe...


Scan and Join Optimization by Exploiting Internal Parallelism of Flash-Based Solid State Drives

Nowadays, flash-based solid state drives (SSDs) are gradually replacing hard disk drives (HDDs) as the primary non-volatile storage in both desktop and enterprise applications because of their potential to speed up performance and reduce power consumption. However, database query processing engines are designed based on the fundamental characteristics of HDDs, so they may not benefit immediatel...


B+-tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives

Previous research addressed the potential problems that the hard-disk-oriented design of DBMSs causes on flash SSDs. In this paper, we focus on exploiting potential benefits of flash SSDs. First, we examine the internal parallelism issues of flash SSDs by conducting benchmarks on various flash SSDs. Then, we suggest algorithm-design principles in order to best benefit from the internal parallelism. We presen...



Journal:
  • IEEE Data Eng. Bull.

Volume 37

Publication date: 2014